【Hackathon 5th No.2】Add index_fill API to Paddle RFC #621
Conversation
- index (Tensor) – a 1-D tensor containing the indices; supports int32 and int64
- value (float) – the value to fill with; supports int32, int64, float32, float64
- name (str) – for usage see [Name](https://www.paddlepaddle.org.cn/documentation/docs/zh/api_guides/low_level/program.html#api-guide-name); usually not needed, defaults to None.
- As a Tensor, reads and writes of `x` should in principle be independent of dtype, so all dtypes should be supported unless there is a special reason. Since this relies on other APIs, please point it out explicitly if a restriction is caused by one of those APIs.
- For the default behavior of the `axis` parameter, please first add the behavior of competing frameworks for comparison. If the design follows other similar Paddle APIs, this can be covered in the "Current status" section.
- `value` is the same as `x`: in principle all dtypes should be supported. Also consider whether a 0-d tensor should be supported.
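For reference, the intended semantics of the proposed `index_fill` can be sketched as a small NumPy reference implementation (hypothetical illustration, not the proposed kernel; dtype of `x` is irrelevant here, which matches the point above):

```python
import numpy as np

def index_fill_ref(x, axis, index, value):
    # Fill the slices of `x` selected by `index` along `axis` with `value`.
    out = x.copy()
    sel = [slice(None)] * x.ndim   # full slice on every axis ...
    sel[axis] = np.asarray(index)  # ... except the filled one
    out[tuple(sel)] = value
    return out

x = np.arange(12, dtype=np.float64).reshape(3, 4)
print(index_fill_ref(x, 0, [0, 2], -1.0))  # rows 0 and 2 become -1
```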
Done
`return paddle.reshape(out, x_dim_vec)`

The index traversal follows the CPU implementation of the cummax/cummin operators ([link](https://github.com/PaddlePaddle/Paddle/pull/53546/files#diff-0417a927e0148c22ecb722f950e2f9704d6e899e9899521f0a269b173ceb2de2)).
Overall, since this code runs a two-level loop on the Python side, could it have performance problems? `index_put` indexes in axis order; could you compare against a transpose + index_put approach to get a first estimate of the performance difference between the two?
My understanding of the transpose + index_put approach: transpose moves the specified axis to the front, which gathers the elements that need to be overwritten; the subscripts are then extracted and passed to `index_put`, and the result is transposed back to the original shape. The code below implements part of that logic:
```python
import numpy as np
import paddle
import torch

arr = np.random.random((4, 3, 2)).astype('float64')
pd_arr = paddle.to_tensor(arr)
tor_arr = torch.tensor(arr)
index = [0, 2]
# axis = 0: the transpose is the identity permutation
print(paddle.transpose(pd_arr, perm=[0, 1, 2]))
print(torch.transpose(torch.index_fill(tor_arr, 0, torch.tensor(index), -1), 0, 0))
# axis = 1: move the filled axis to the front, fill, then transpose back
print(paddle.transpose(pd_arr, perm=[1, 0, 2]))
print(torch.transpose(torch.index_fill(tor_arr, 1, torch.tensor(index), -1), 0, 1))
```
A rough estimate of the performance difference: let the tensor hold N elements with rank R and shape ndim, let the index contain L elements, assume the shape-changing ops reshape, flatten, and transpose are all O(N), and assume the `index_put` cost is the same O(P) in both cases.

Current approach: one flatten, one reshape, and L index scans of N/ndim[axis] = S elements each, for a total of 2·O(N) + L·O(N/ndim[axis]) + O(P).

transpose + index_put: two transposes plus constructing R per-axis index arrays, each taking L·O(N/ndim[axis]), for a total of 2·O(N) + (R−1+L)·L·O(N/ndim[axis]) + O(P).

Because the indices still have to be constructed, the Python-side loops are not reduced; there are actually more of them. My understanding is that looping over the N/ndim[axis]·L index positions (to build the index array) is hard to avoid; flattening reduces the cost of building that array from R passes to one, which is probably better. But if transpose itself is implemented more efficiently than reshape/flatten, then transpose + index_put might come out ahead.
1. The complexity analysis is fine, but the same operation runs at different speeds in Python vs C++ and on CPU vs GPU, so it is best to compare actual timing numbers.
2. transpose currently uses the stride mechanism by default and returns a view of the original Tensor; in-place modification of the view shows up directly in the original Tensor, so in theory the inplace version does not need to transpose back.

Could you put together a quick implementation and compare timings on a few small cases?
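The view point above can be checked with a small NumPy sketch (NumPy as a stand-in for the Paddle ops; `np.moveaxis` returns a view much like Paddle's strided transpose, so assigning through it needs no transpose back):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((4, 3, 2))
axis, index, value = 1, np.array([0, 2]), -1.0

# Approach A: fancy-index along `axis` directly.
a = x.copy()
sel = [slice(None)] * a.ndim
sel[axis] = index
a[tuple(sel)] = value

# Approach B: move `axis` to the front; np.moveaxis returns a view,
# so the assignment mutates `b` in place, no transpose back needed.
b = x.copy()
np.moveaxis(b, axis, 0)[index] = value

assert np.array_equal(a, b)
print("view assignment matches direct fancy indexing")
```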
- value (float) – the value to fill with

Given the `dim` and `index` used to locate positions, the corresponding entries of the tensor are modified in place to `value`.
Please flesh out the details of this API: e.g. the supported dtypes; whether `index` has rank requirements; whether `value` supports 0-d tensors, complex types, and so on.
Done
## Naming and parameter design

The API is designed as `paddle.index_fill(x, axis, index, value, name)` and `paddle.index_fill_(x, axis, index, value, name)`.
For consistency with similar APIs such as index_add / index_select, it is recommended to put `index` before `axis`.
A quick test of the two approaches above with an input of size [400, 300, 20], run ten times and timed: flatten + index_put and transpose + index_put both take about 2s. I then tried assigning directly on the tensor dimensions after the transpose; that takes only about 0.05s in the test, an improvement of almost two orders of magnitude, and the code is also simpler. Is this approach acceptable? I just pushed a version: PaddlePaddle/Paddle#57416
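A minimal timing harness for this kind of comparison, sketched in NumPy (a stand-in only; the real numbers should of course be measured on Paddle tensors), contrasting flatten + scatter-by-flat-index against direct assignment through a transposed view:

```python
import time
import numpy as np

def bench(fn, reps=10):
    t0 = time.perf_counter()
    for _ in range(reps):
        fn()
    return time.perf_counter() - t0

x = np.random.random((400, 300, 20))
axis, index, value = 1, np.array([0, 2, 5]), -1.0

def flat_scatter():
    # flatten, scatter into computed flat positions, reshape back
    out = x.copy().reshape(-1)
    pos = np.moveaxis(np.arange(x.size).reshape(x.shape), axis, 0)[index].reshape(-1)
    out[pos] = value
    return out.reshape(x.shape)

def view_assign():
    # assign directly through a transposed view; mutates `out` in place
    out = x.copy()
    np.moveaxis(out, axis, 0)[index] = value
    return out

assert np.array_equal(flat_scatter(), view_assign())
print(f"flat scatter: {bench(flat_scatter):.4f}s  view assign: {bench(view_assign):.4f}s")
```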
@Patrick-Star125 Just go with the faster approach. Please also update the solution section of the RFC accordingly; after it is merged we will review the development PR.
Updated.
```python
if in_dynamic_mode():
    out[index] = value
else:
    out = paddle.static.setitem(out, index, value)
```
Here it is recommended to call index_put or index_put_ directly; index parsing and dispatch to the concrete OP involve extra logic that adds overhead.
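The distinction can be illustrated in NumPy terms (a rough analogue only, not Paddle internals): `out[index] = value` goes through generic `__setitem__` index parsing and classification, while a direct scatter routine takes index tensors straight through; here `np.put_along_axis` is used as a hypothetical stand-in for `index_put_`:

```python
import numpy as np

x = np.zeros((4, 3))
index = np.array([0, 2])

# Generic __setitem__: the index expression is parsed/classified first.
a = x.copy()
a[index] = -1.0

# Direct scatter routine: index tensors are passed straight through,
# skipping the parsing step (np.put_along_axis as a rough stand-in
# for paddle.index_put_).
b = x.copy()
np.put_along_axis(b, np.broadcast_to(index[:, None], (2, 3)), -1.0, axis=0)

assert np.array_equal(a, b)
print("direct scatter matches __setitem__")
```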
Done
LGTM
Add index_fill API design document